Project-Team:DOLPHIN

Inria | Raweb 2015 | Presentation of the Project-Team DOLPHIN | DOLPHIN Web Site



Section: New Results

Parallel B&B revisited for coprocessors using our new IVM data structure dedicated to permutation problems

Participants: J. Gmys, R. Leroy and N. Melab

This contribution is joint work with M. Mezmaz and D. Tuyttens from the University of Mons (UMONS).

Solving large permutation Combinatorial Optimization Problems (COPs) using Branch-and-Bound (B&B) algorithms generates a very large pool of subproblems. Defining a dedicated data structure is therefore crucial to store and manage that pool efficiently. In the Ph.D. thesis of R. Leroy [11], we have proposed an original data structure for permutation COPs, called Integer-Vector-Matrix (IVM) and based on the factorial number system, and we have redefined the operators of the B&B algorithm accordingly. To evaluate performance in terms of memory footprint and CPU time, we conducted a complexity analysis and extensive experiments using the permutation Flow-Shop Scheduling Problem (FSP) as a case study. Compared to the Head-Tail Linked List (LL) data structure often used for parallel B&B, as in our work [11], IVM requires up to $n$ times less memory, $n$ being the size of the permutations. Moreover, the IVM-based B&B is up to one order of magnitude faster than its LL-based counterpart in managing the pool of subproblems.

Another major contribution of this thesis is to revisit parallel B&B for multi-core processors and many-core coprocessors (GPU and MIC) using IVM and LL-based work stealing. Several challenging issues are addressed, including work distribution using factoradic-based intervals on multi-core processors, thread/branch divergence and data placement optimization on GPU, and vectorization on Intel Xeon Phi. The contribution and some of its extensions have been published in [40], [18]. An extensive experimental study shows that the IVM-based approach outperforms its LL-based counterpart by a significant margin on multi-core processors as well as on coprocessors.
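The factorial number system on which IVM is based can be illustrated with a short Python sketch. It is not the IVM implementation from [11]; it only shows the standard correspondence between a permutation of 0..n-1, its factoradic digits (Lehmer code), and its rank in lexicographic order. All function names are hypothetical and chosen for this illustration.

```python
from math import factorial


def permutation_to_factoradic(perm):
    """Factoradic digits (Lehmer code) of a permutation of 0..n-1.

    Digit i is the index of perm[i] among the items not yet placed.
    """
    remaining = sorted(perm)
    digits = []
    for item in perm:
        idx = remaining.index(item)
        digits.append(idx)
        remaining.pop(idx)
    return digits


def factoradic_to_rank(digits):
    """Lexicographic rank in [0, n!) encoded by the factoradic digits."""
    n = len(digits)
    return sum(d * factorial(n - 1 - i) for i, d in enumerate(digits))


def rank_to_factoradic(rank, n):
    """Inverse of factoradic_to_rank for a rank in [0, n!)."""
    digits = []
    for i in range(n):
        base = factorial(n - 1 - i)
        digits.append(rank // base)
        rank %= base
    return digits


def factoradic_to_permutation(digits):
    """Rebuild the permutation of 0..n-1 from its factoradic digits."""
    remaining = list(range(len(digits)))
    return [remaining.pop(d) for d in digits]


if __name__ == "__main__":
    digits = permutation_to_factoradic([2, 0, 1])
    print(digits)                             # [2, 0, 0]
    print(factoradic_to_rank(digits))         # 4
    print(factoradic_to_permutation(digits))  # [2, 0, 1]
```

Because every permutation maps to a unique integer in [0, n!), a contiguous set of permutations can be described by two such numbers, which is what makes interval-based work distribution possible.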

A major extension of this work, proposed in [54] and distinguished with a best paper award, consists in offloading all the operators of the B&B algorithm to the GPU. Four interval-based work stealing (WS) strategies have been investigated using IVM. Extensive experiments allowed us to demonstrate that the GPU-accelerated approach is 5 times faster than its multi-core counterpart.
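To make the idea of interval-based work stealing concrete, the following Python sketch represents a unit of work as a half-open interval of lexicographic ranks in [0, n!) and lets a thief take the right half of a victim's interval. The steal-half policy and the names used here are assumptions made for illustration; they are not claimed to be one of the four strategies studied in [54].

```python
from math import factorial


def rank_to_factoradic(rank, n):
    """Factoradic digits of a rank in [0, n!), most significant digit first."""
    digits = []
    for i in range(n):
        base = factorial(n - 1 - i)
        digits.append(rank // base)
        rank %= base
    return digits


def steal_half(victim):
    """Split a half-open interval [begin, end) of ranks into two halves.

    Returns (kept, stolen). If the interval holds fewer than two ranks,
    nothing is stolen and stolen is None. Hypothetical policy.
    """
    begin, end = victim
    if end - begin < 2:
        return victim, None
    mid = begin + (end - begin) // 2
    return (begin, mid), (mid, end)


if __name__ == "__main__":
    n = 4
    whole_tree = (0, factorial(n))  # all 4! = 24 permutations
    kept, stolen = steal_half(whole_tree)
    print(kept, stolen)                       # (0, 12) (12, 24)
    print(rank_to_factoradic(stolen[0], n))   # [2, 0, 0, 0]
```

In this sketch a work unit is fully described by two integers (or, equivalently, two factoradic digit vectors), so transferring work between threads amounts to copying a fixed, small amount of data rather than a list of subproblems.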
